172 research outputs found

Design Optimization of Tank Track Pad Meta-Material Using the Cell Synthesis Method

Author: Kulkarni Neehar Milind
Publication venue: Clemson University Libraries
Publication date: 01/05/2016

The elastomeric backer pad on the M1 Abrams tank track experiences highly cyclic and dynamic loads during normal operating conditions. As a result, extensive heat is generated within the pad due to its viscoelastic hysteretic nature, which leads to its early failure. Research has been carried out in the past at Clemson University to design a meta-material that will mimic the deformation behavior of the elastomeric backer pad but will be made out of a linearly elastic constitutive material to eliminate hysteresis. A meta-material in this context is an artificial material in the form of a periodic structure that exhibits effective properties that differ from those of its constitutive material. Previous attempts to design a feasible meta-material as an effective replacement for the existing elastomeric backer pad have been unsuccessful. The work carried out in this research is therefore focused on developing a meta-material that satisfies all the application-specific requirements. The meta-material is designed based on the steps prescribed by the Unit Cell Synthesis Method, which was developed in previous research. Using this method, a unit-cell-based periodic meta-material can be designed that exhibits nonlinear deformation behavior by implementing various combinations of different elemental geometries that show geometric nonlinearity under deformation. The idea is to attain a targeted nonlinear deformation response of the meta-material structure by tuning the geometric nonlinearities of one or multiple entities in order to replace the material nonlinearity of the target material. A modification is proposed to the original method to make it more efficient by introducing a multi-objective optimization step that considers all the relevant feasibility criteria concerning the meta-material design. Two unit-cell-based meta-material concepts are evaluated, and the best meta-material design is chosen based on the results obtained from the multi-objective optimization problem.
The optimized meta-material is then subjected to dynamic tank wheel roll-over conditions to compare its deformation response with that of the original pad. Finally, conclusions are drawn and scope for future work is discussed.

Clemson University: TigerPrints

Scheduling Transformation and Dependence Tests for Recursive Programs

Authors: Kulkarni Milind; Sundararajah Kirshanthan
Publication venue: 'Purdue University (bepress)'
Publication date: 01/11/2018

Scheduling transformations reorder the execution of operations in a program to improve locality and/or parallelism. The polyhedral model provides a general framework for performing instance-wise scheduling transformations for regular programs, reordering the iterations of loops that operate over dense arrays through transformations like tiling. There is no analogous framework for recursive programs, despite recent interest in optimizations like tiling and fusion for recursive applications. This paper presents PolyRec, the first general framework for applying scheduling transformations (like inlining, interchange, and code motion) to nested recursive programs and reasoning about their correctness. We describe the phases of PolyRec (representing dynamic instances, applying transformations, and reasoning about correctness) and show that PolyRec is able to apply sophisticated, composed transformations to complex, nested recursive programs and improve performance through enhanced locality.

Purdue E-Pubs
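
To make the idea of "interchange" on a nested recursive program concrete, here is a minimal sketch. It is not PolyRec and does not model its representation of dynamic instances or its dependence tests. The `Node` type and every function name are hypothetical, and the sketch assumes each unit of work (one pair of nodes) is independent of the others, so any execution order is legal.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


Pair = Tuple[str, str]


def original(o: Optional[Node], i_root: Node, out: List[Pair]) -> None:
    """Outer recursion over tree `o`; one full inner traversal per outer node."""
    if o is None:
        return
    visit_inner(o, i_root, out)
    original(o.left, i_root, out)
    original(o.right, i_root, out)


def visit_inner(o: Node, i: Optional[Node], out: List[Pair]) -> None:
    """Inner recursion: record the work item (o, i) for every inner node."""
    if i is None:
        return
    out.append((o.label, i.label))
    visit_inner(o, i.left, out)
    visit_inner(o, i.right, out)


def interchanged(o_root: Node, i: Optional[Node], out: List[Pair]) -> None:
    """Interchanged schedule: the inner tree now drives the outer recursion."""
    if i is None:
        return
    visit_outer(o_root, i, out)
    interchanged(o_root, i.left, out)
    interchanged(o_root, i.right, out)


def visit_outer(o: Optional[Node], i: Node, out: List[Pair]) -> None:
    """Traverse the outer tree for one fixed inner node."""
    if o is None:
        return
    out.append((o.label, i.label))
    visit_outer(o.left, i, out)
    visit_outer(o.right, i, out)
```

Both schedules execute the same set of work items and differ only in order, which is the property a scheduling transformation has to preserve. Whether a given reordering is actually legal depends on the dependences between work items, which this sketch deliberately ignores.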

RT-kNNS Unbound: Using RT Cores to Accelerate Unrestricted Neighbor Search

Authors: Kulkarni Milind; Mandarapu Durga; Nagarajan Vani
Publication date: 26/05/2023

The problem of identifying the k-Nearest Neighbors (kNNS) of a point has proven to be very useful both as a standalone application and as a subroutine in larger applications. Given its far-reaching applicability in areas such as machine learning and point clouds, extensive research has gone into leveraging GPU acceleration to solve this problem. Recent work has shown that using Ray Tracing (RT) cores in recent GPUs to accelerate kNNS is much more efficient than traditional acceleration using shader cores. However, the existing translation of kNNS to a ray tracing problem imposes a constraint on the search space for neighbors. As a result, RT cores can only be used to accelerate fixed-radius kNNS, which requires the user to set a search radius a priori and hence can miss neighbors. In this work, we propose TrueKNN, the first unbounded RT-accelerated neighbor search. TrueKNN adopts an iterative approach where we incrementally grow the search space until all points have found their k neighbors. We show that our approach is orders of magnitude faster than existing approaches and can even be used to accelerate fixed-radius neighbor searches. Comment: This paper has been accepted at the International Conference on Supercomputing 2023 (ICS'23).

arXiv.org e-Print Archive
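
The iterative strategy described in the TrueKNN abstract above (grow the search space until every point has k neighbors) can be sketched on the CPU as follows. This is a sketch under stated assumptions, not the authors' implementation. A brute-force scan stands in for the RT-core fixed-radius query, and the starting radius, the doubling growth factor, and all function names are assumptions, since the abstract does not specify them.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def fixed_radius_knn(points: List[Point], query_idx: int, k: int,
                     radius: float) -> List[int]:
    """Return up to k nearest neighbor indices that lie within `radius`.

    Brute-force stand-in (assumption) for a hardware fixed-radius query.
    """
    q = points[query_idx]
    candidates = []
    for j, p in enumerate(points):
        if j == query_idx:
            continue
        d = math.dist(q, p)
        if d <= radius:
            candidates.append((d, j))
    candidates.sort()
    return [j for _, j in candidates[:k]]


def unbounded_knn(points: List[Point], k: int, start_radius: float = 1.0,
                  growth: float = 2.0) -> List[List[int]]:
    """Grow the radius until every point has found k neighbors."""
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")
    if start_radius <= 0 or growth <= 1:
        raise ValueError("need start_radius > 0 and growth > 1")
    result: Dict[int, List[int]] = {}
    pending = set(range(len(points)))
    radius = start_radius
    while pending:
        # Only points that are still short of k neighbors are re-queried.
        for i in list(pending):
            found = fixed_radius_knn(points, i, k, radius)
            if len(found) == k:
                # Any point outside the radius is farther than these k.
                result[i] = found
                pending.discard(i)
        radius *= growth
    return [result[i] for i in range(len(points))]
```

Once a point has k neighbors inside the current radius, those are its true k nearest neighbors, so it can be dropped from later rounds. Only the points in sparse regions pay for the larger radii.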

D2P: Automatically Creating Distributed Dynamic Programming Codes

Authors: Chang Qifan; Hegde Nikhil; Kulkarni Milind
Publication venue: 'Purdue University (bepress)'
Publication date: 31/10/2018

Dynamic Programming (DP) algorithms are common targets for parallelization, and, as these algorithms are applied to larger inputs, distributed implementations become necessary. However, creating distributed-memory solutions involves the challenges of task creation, program and data partitioning, communication optimization, and task scheduling. In this paper we present D2P, an end-to-end system for automatically transforming a specification of any recursive DP algorithm into a distributed-memory implementation of the algorithm. When given pseudo-code of a recursive DP algorithm, D2P automatically generates the corresponding MPI-based implementation. Our evaluation of the generated distributed implementations shows that they are efficient and scalable. Moreover, D2P-generated implementations are faster than implementations generated by recent general distributed DP frameworks, and are competitive with (and often faster than) hand-written implementations.

Purdue E-Pubs

Generalized Neighbor Search using Commodity Hardware Acceleration

Authors: Kulkarni Milind; Mandarapu Durga; Nagarajan Vani
Publication date: 15/11/2023

Tree-based Nearest Neighbor Search (NNS) is hard to parallelize on GPUs. However, newer Nvidia GPUs are equipped with Ray Tracing (RT) cores that can build a spatial tree called a Bounding Volume Hierarchy (BVH) to accelerate graphics rendering. Recent work has proposed using RT cores to implement NNS, but all such approaches share a hardware-imposed constraint on the distance metric, which must be the Euclidean distance. We propose and implement two approaches for generalized distance computations, filter-refine and monotone transformation, each of which allows non-Euclidean nearest neighbor queries to be performed in terms of Euclidean distances. We find that our reductions improve the time taken to perform distance computations during the search, thereby improving the overall performance of the NNS.

arXiv.org e-Print Archive
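
As one illustration of what a "monotone transformation" can look like, the sketch below reduces a cosine-distance nearest neighbor query to a Euclidean one. The choice of cosine distance is an assumption made for this example, since the abstract does not say which metrics the paper covers, and all function names are hypothetical. It relies on a standard identity for unit vectors u and v: ||u - v||^2 = 2 * (1 - cos(u, v)), so ranking by Euclidean distance after normalization matches ranking by cosine distance.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def norm(v: Vector) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v: Vector) -> List[float]:
    """Scale a nonzero vector to unit length."""
    n = norm(v)
    if n == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / n for x in v]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cos(angle between a and b), for nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm(a) * norm(b))


def nearest_direct(points: List[Vector], query: Vector) -> int:
    """Reference answer: minimize cosine distance directly."""
    return min(range(len(points)),
               key=lambda i: cosine_distance(points[i], query))


def nearest_via_euclidean(points: List[Vector], query: Vector) -> int:
    """Same query answered with Euclidean distances only.

    Normalizing is the transformation; after it, a Euclidean-only search
    structure could be used in place of the linear scan shown here.
    """
    unit_points = [normalize(p) for p in points]
    unit_query = normalize(query)
    return min(range(len(points)),
               key=lambda i: math.dist(unit_points[i], unit_query))
```

In this setting the normalization is a one-time preprocessing step, after which the search itself needs nothing but Euclidean distance comparisons.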

Tribochemical investigation of microelectronic materials

Author: Kulkarni Milind Sudhakar
Publication date: 02/06/2009

To achieve efficient planarization with reduced device dimensions in integrated circuits, a better understanding of the physics, chemistry, and the complex interplay involved in chemical mechanical planarization (CMP) is needed. The CMP process takes place at the interface of the pad and wafer in the presence of the fluid slurry medium. The hardness of Cu is significantly lower than that of the slurry abrasive particles, which are usually alumina or silica. It has been accepted that a surface layer can protect the Cu surface from scratching during CMP. Four competing mechanisms in material removal have been reported: the chemical dissolution of Cu, the mechanical removal through slurry abrasives, the formation of a thin layer of Cu oxide, and the sweeping of surface material by slurry flow. Despite previous investigations of Cu removal, the electrochemical properties of the Cu surface layer are yet to be understood. The motivation of this research was to understand the fundamental aspects of removal mechanisms in terms of electrochemical interactions, chemical dissolution, mechanical wear, and factors affecting planarization. Since one of the major requirements in CMP is to have a high surface finish, i.e., low surface roughness, optimization of the surface finish in reference to various parameters was emphasized. Three approaches were used in this research: in situ measurement of material removal, exploration of the electropotential activation and passivation at the copper surface, and modeling of the synergistic electrochemical-mechanical interactions on the copper surface. In this research, copper polishing experiments were conducted using a tabletop tribometer. A potentiostat was coupled with this tribometer. This combination enabled the evaluation of important variables such as applied pressure, polishing speed, slurry chemistry, pH, materials, and applied DC potential.
Experiments were designed to understand the combined and individual effects of electrochemical interactions as well as mechanical impact during polishing. Extensive surface characterization was performed with AFM, SEM, TEM and XPS. An innovative method for direct material removal measurement on the nanometer scale was developed and used. Experimental observations were compared with the theoretically calculated material removal rate values. The synergistic effect of all of the components of the process, which results in a better-quality surface finish, was quantitatively evaluated for the first time. Impressed potential during CMP proved to be a controlling parameter in the material removal mechanism. Using the experimental results, a model was developed, which provided a practical insight into the CMP process. The research is expected to help with electrochemical material removal in copper planarization with low-k dielectrics.

Texas A&M Repository

Stratified Online Sampling for Sound Approximation in MapReduce

Authors: . Nitin; Kulkarni Milind; Thottethodi Mithuna; Vijaykumar T.N.
Publication venue: 'Purdue University (bepress)'
Publication date: 05/11/2015

Purdue E-Pubs

Efficient GPU Tree Walks for Effective Distributed N-Body Simulations

Authors: Kulkarni Milind; Liu Jianqiao; Quinn Thomas; Robson Michael
Publication venue: Smith ScholarWorks
Publication date: 26/06/2019

N-body problems, such as simulating the motion of stars in a galaxy, are popularly solved using tree codes like Barnes-Hut. ChaNGa is a best-of-breed N-body platform that uses an asymptotically efficient tree traversal strategy known as a dual-tree walk to quickly determine which bodies need to interact with each other to provide an accurate simulation result. However, this strategy does not work well on GPUs, due to the highly irregular nature of the dual-tree algorithm. On GPUs, ChaNGa uses a hybrid strategy where the CPU performs the tree walk to determine which bodies interact while the GPU performs the force computation. In this paper, we show that a highly optimized single-tree walk approach is able to achieve better GPU performance by significantly accelerating the tree walk and reducing CPU/GPU communication. Our experiments show that this new design can achieve an 8.25× speedup over baseline ChaNGa using a one-node, one-process-per-node configuration.

Crossref

Smith College: Smith ScholarWorks
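
For readers unfamiliar with the single-tree walk mentioned in the abstract above, the following is a minimal, sequential, one-dimensional Barnes-Hut sketch. It is not ChaNGa code and not the paper's GPU kernel. The `Cell` type, the function names, the unit gravitational constant, and the opening test (cell width < theta × distance) are all assumptions chosen for illustration, and bodies are assumed to have distinct positions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Body = Tuple[float, float]  # (position, mass); positions assumed distinct


@dataclass
class Cell:
    """Tree node covering the interval [lo, hi] (hypothetical layout)."""
    lo: float
    hi: float
    mass: float
    com: float                     # center of mass of the bodies inside
    bodies: List[int]              # indices of the bodies inside
    children: List["Cell"] = field(default_factory=list)


def build(bodies: List[Body], idx: List[int], lo: float, hi: float) -> Cell:
    """Recursively bisect the interval until each leaf holds one body."""
    mass = sum(bodies[i][1] for i in idx)
    com = sum(bodies[i][0] * bodies[i][1] for i in idx) / mass
    cell = Cell(lo, hi, mass, com, idx)
    if len(idx) > 1:
        mid = (lo + hi) / 2
        left = [i for i in idx if bodies[i][0] < mid]
        right = [i for i in idx if bodies[i][0] >= mid]
        if left:
            cell.children.append(build(bodies, left, lo, mid))
        if right:
            cell.children.append(build(bodies, right, mid, hi))
    return cell


def build_tree(bodies: List[Body]) -> Cell:
    """Build the tree over the span of all body positions."""
    xs = [p for p, _ in bodies]
    return build(bodies, list(range(len(bodies))), min(xs), max(xs))


def pull(x: float, src_pos: float, src_mass: float) -> float:
    """Signed acceleration at x due to a point mass (G = 1 assumed)."""
    r = src_pos - x
    return src_mass * r / abs(r) ** 3


def walk(bodies: List[Body], cell: Cell, i: int, theta: float) -> float:
    """Single-tree walk: one body traverses the tree from the given cell."""
    x = bodies[i][0]
    if not cell.children:          # leaf: exactly one body
        j = cell.bodies[0]
        return 0.0 if j == i else pull(x, bodies[j][0], bodies[j][1])
    if (cell.hi - cell.lo) < theta * abs(cell.com - x):
        # Cell is far enough away: treat it as a single point mass.
        return pull(x, cell.com, cell.mass)
    return sum(walk(bodies, child, i, theta) for child in cell.children)
```

Each call to `walk` handles one body and does not depend on the walk for any other body. With `theta = 0` no cell is ever approximated, so the result matches a direct pairwise sum, while larger values of `theta` trade accuracy for fewer cell visits.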